Thermal Treatment of the Minority Game 
We study a cost function for the aggregate behavior of all the agents involved in the Minority
Game (MG) or the Bar Attendance Model (BAM). The cost function allows us to define a deterministic,
synchronous dynamics whose results share the main relevant features of those of the
probabilistic, sequential dynamics used for the MG or the BAM. We define a temperature through
a Langevin approach in terms of the fluctuations of the average attendance. We prove that the cost
function is an extensive quantity that can play the role of an internal energy of the many-agent
system, while the temperature so defined is an intensive parameter. We compare the results of the
thermal perturbation to the deterministic dynamics and prove that they agree with those obtained
with the MG or BAM in the limit of very low temperature.

I. INTRODUCTION

The Bar Attendance Model (BAM) and the Minority Game (MG) (see Refs. || - ||) have recently become
regular testing grounds to investigate how the individual actions of a system of independent agents
give rise to some kind of macroscopic ordering. In the MG, the agents have to make a binary decision
which, for the sake of concreteness, is usually taken to be going or not going to a bar. The winning
option is that of the minority. The MG is a particular case of the BAM, which was in turn introduced
to show how an ensemble of agents that perform inductive reasoning can self-organize to match some
condition that is generally accepted to be the most adequate. In the case of the BAM this
corresponds to the largest acceptable attendance that does not cause discomfort.

Both models have been compared with each other in Refs. [6] and [7] by working out a generalized
version of the MG (the GMG) in order to consider situations in which the minority is replaced by an
arbitrary fraction $\mu$ of the ensemble of players. This fraction is fixed externally as a control
parameter. In all these models the players update their attendance probabilities with a random
correction that depends upon the past record of successes and failures. Asymptotically stable
configurations are always reached. These are, however, of quite different nature depending upon
the values of the control parameters, the initial conditions and the updating rules involved
in each model.

In the present work we are interested in the cases in which the asymptotic stable distribution can
be assimilated to a kind of thermodynamic equilibrium. In these situations the agents continue to
update their attendance probabilities but the corresponding probability density distribution
remains stationary. The stochastic dynamics that has been developed for the BAM in Ref. [7] always
leads the system to this type of configuration, while in the cases studied for the GMG, when $\mu$
is significantly larger (or smaller) than 1/2, the system gets stuck in quenched configurations
that strongly depend upon the initial conditions. Updating stops because agents have accumulated a
great number of successes. However, these "glassy" states can nevertheless be "melted" into
equilibrium if the memory of past successes is repeatedly eliminated in an iterative process that
can be assimilated to an annealing procedure.

A remarkable result that has been obtained in all numerical simulations is that the equilibrium
configuration entails a diversity in the individual actions. The population is drastically
partitioned into two subsets, one that always goes to the bar and another that never goes. It
therefore seems that, in spite of the fact that the agents do not exchange information, they manage
to coordinate their actions to proceed in two opposite ways. The numbers of agents in the two
subsets are in a ratio that is equal to $\mu/(1 - \mu)$. Such polarization is not an intuitive
result. A naive guess is to assume that all agents should choose the same probability of attendance
and that this should be equal to $\mu$. However, this turns out not to be a stable distribution
because parties that are larger or smaller than the accepted crowding occur with high probability.

The fact that all agents adjust their attendance probabilities in order to minimize their failures
(i.e. going when the bar is crowded or not going when the bar is empty) leads to an aggregate
behavior that minimizes a global cost associated to inadequate attendances. We propose to express
such cost by the second moment of the attendance with respect to the acceptable level $\mu$.

The purpose of the present paper is to investigate the effects of introducing that cost function in
the relaxation dynamics of the system. We show that this is a Lyapunov function for the many-agent
system, i.e. it is possible to derive a deterministic dynamics, the descent along its gradient,
that monotonically reduces its value. This corresponds to a heavily coordinated, synchronous
evolution.

We prove that the cost function meets the requirements of an internal energy of the many-agent
system. We also introduce a temperature parameter through a Langevin-like approach that can be
defined in terms of the fluctuations of the attendance strategies. Except for finite size effects
this can be proven to be an intensive parameter. We also superimpose thermal fluctuations on the
deterministic dynamics mentioned above. Depending upon the amplitude of these fluctuations, the
polarization is gradually smeared until it disappears completely.

The thermally modified relaxation process that we define here is completely different from those
involved in the GMG or BAM approaches, which involve the independent and uncoordinated actions of
all the agents. The latter involve a random updating of individual attendance strategies governed
by a (small) uncertainty amplitude that is interpreted as the precision of such updating. We prove
that in the limit of low temperature and small uncertainty amplitude both dynamics lead to entirely
equivalent asymptotic equilibrium configurations. The thermal interpretation of the uncertainty
amplitude also allows one to cast the annealing process presented in Refs. |q] and [7] into a
thermal framework, as in the well known case of simulated annealing Q.

In Sec. II we derive the cost function, and in Sec. III we investigate the dynamics that
corresponds to the descent along its gradient. In Sec. IV we present a Langevin approach to define
the temperature in terms of the fluctuations that are present in the asymptotic equilibrium
configuration. In Sec. V we compare this with more traditional approaches to the relaxation
process. In Sec. VI we draw the conclusions.



II. THE COST FUNCTION 

Consider a set of N agents, each having a probability $p_i$ ($i = 1, 2, \ldots, N$) of going to the
bar. The distribution of the $p_i$'s is given by the probability density function $P(p)$. As we
shall shortly explain, the $p_i$ are updated in time according to some dynamics and therefore the
function $P(p)$ also changes in time.

In the ordinary rules of the GMG, a player loses a point when she goes to the bar and finds it
crowded, or when she does not go and the bar is empty. If the opposite happens she gains a point.
The level of crowding is specified by the value of the control parameter $\mu$. When her account of
points falls below zero she updates her attendance probability, choosing at random a different
value within the interval $(p_i - \delta p/2,\; p_i + \delta p/2)$. When equilibrium is reached,
the resulting distribution $P(p)$ concentrates the population in the immediate neighborhood of
$p \simeq 0$ and $p \simeq 1$, plus an almost vanishing contribution from intermediate values. The
ratio of the areas below these two peaks is close to $\mu/(1 - \mu)$.
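
The updating rules just described can be sketched in a few lines of Python. This is only an
illustrative sketch under stated assumptions: the function name `gmg_step`, the reflective treatment
of values that fall outside [0, 1], and the reset of the account to zero after an update are our
choices and are not specified in the text above.

```python
import numpy as np

def gmg_step(p, points, mu, dp, rng):
    """One round of the GMG for all agents (p and points are updated in place)."""
    n = p.size
    goes = rng.random(n) < p                  # each agent attends with probability p_i
    crowded = goes.sum() > mu * n             # attendance above the tolerated level
    # An agent fails if she goes to a crowded bar or stays away from an uncrowded one.
    fails = goes == crowded
    points += np.where(fails, -1, 1)
    # Agents whose account fell below zero redraw p_i within a window of width dp.
    losers = points < 0
    trial = p[losers] + dp * (rng.random(losers.sum()) - 0.5)
    # Assumption: reflect at 0 and 1 so that p_i remains a probability.
    p[losers] = 1.0 - np.abs(1.0 - np.abs(trial))
    points[losers] = 0                        # assumption: the account restarts at zero
    return goes.sum()

rng = np.random.default_rng(0)
N, mu, dp = 101, 0.6, 0.1
p = rng.random(N)                             # uniformly distributed initial strategies
points = np.zeros(N, dtype=np.int64)
for _ in range(2000):
    gmg_step(p, points, mu, dp, rng)
```

A histogram of `p` after a long run gives a numerical estimate of $P(p)$ under these assumptions.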

The aggregate behavior is associated to the density distribution $V(A)$ that gives the probability
of occurrence of a party of A customers attending the bar. The function $V(A)$ is of course
completely determined by $P(p)$. In order to calculate it let us assume without loss of generality
that all the agents distribute themselves into $D + 1$ different bins of $n_d$
($d = 0, 1, \ldots, D$) agents each, with strategies $p_d = d/D$. The density distribution $P(p)$
can then be written as:

$$P(p) = \frac{1}{N}\sum_{d=0}^{D} n_d\,\delta(p - p_d) \tag{1}$$

With this assumption, the distribution $V(A)$ can be written as:

$$V(A) = \sum_{\ell_0=0}^{n_0}\cdots\sum_{\ell_D=0}^{n_D}\left[\prod_{d=0}^{D}\binom{n_d}{\ell_d}\,p_d^{\ell_d}(1-p_d)^{n_d-\ell_d}\right]\delta\Big(A - \sum_{d}\ell_d\Big) \tag{2}$$

where $\ell_d$ is the number of agents of bin d that attend the bar.



We define the cost function for the whole ensemble of agents as in ref. pj, namely as the second
moment M with respect to the tolerated crowding level $\mu$:

$$C = \sum_{A=0}^{N} (A - N\mu)^2\, V(A) \tag{3}$$



In order to calculate it, we introduce Eq. (2) into the definition of Eq. (3) and perform first the
summation over A, taking advantage of the $\delta(A - \sum_d \ell_d)$. Once this is done, one can
perform the summations involved in each of the terms into which $(N\mu - \sum_d \ell_d)^2$ splits.
The summations over different $\ell$'s decouple from each other and result either in 1, in
$n_d p_d$, in $n_d^2 p_d^2 + n_d p_d (1 - p_d)$ or in $(n_d p_d)(n_{d'} p_{d'})$. These terms can
be gathered again to yield:

$$C = \Big(N\mu - \sum_d n_d p_d\Big)^2 + \sum_d n_d\, p_d (1 - p_d)
    = N^2 (\mu - \langle p \rangle)^2 + N\left(\langle p \rangle - \langle p^2 \rangle\right) \tag{4}$$

where $\langle p^m \rangle$ stands for $\sum_p p^m P(p) = \sum_d p_d^m\, n_d/N$ for $m = 1, 2$. The
expression of C given in Eq. (4) contains no assumption about the system being in equilibrium. This
is the reason why C contains a term proportional to $N^2$ instead of being proportional to the size
N of the system, as befits an extensive magnitude. The numerical simulations however indicate that
in equilibrium $\langle p \rangle = \mu$ and therefore this term cancels except for possible
fluctuations. Actually the $O(N^2)$ term is eliminated by any distribution $P(p)$ whose mean has
the required value $\mu$. For an initial condition with uniformly distributed $p_i$'s and
$P(p) = 1/N$, as used for most simulations, the cost is $C = N^2(\mu - 1/2)^2 + N/6$. Such an
initial condition is a good guess for the final distribution when $\mu \simeq 1/2$ (as for the most
traditional settings of the MG), but it is indeed very poor for the GMG when $\mu \neq 1/2$. In the
next sections we discuss in greater detail the value of C in equilibrium.
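
As a numerical cross-check of the step from Eq. (3) to Eq. (4), the following Python sketch builds
$V(A)$ for independent agents by repeated convolution and compares the second moment about $N\mu$
with the closed form. The function names are ours, and the random strategies are only an example.

```python
import numpy as np

def attendance_distribution(p):
    """V(A), A = 0..N, for independent agents with attendance probabilities p."""
    v = np.array([1.0])
    for pi in p:
        # Adding one agent: she stays home (prob. 1 - pi) or raises A by one (prob. pi).
        v = np.convolve(v, [1.0 - pi, pi])
    return v

def cost_from_distribution(p, mu):
    """Eq. (3): second moment of the attendance about N*mu."""
    n = len(p)
    a = np.arange(n + 1)
    return np.sum((a - n * mu) ** 2 * attendance_distribution(p))

def cost_closed_form(p, mu):
    """Eq. (4): N^2 (mu - <p>)^2 + N (<p> - <p^2>)."""
    p = np.asarray(p, dtype=float)
    n = p.size
    return n**2 * (mu - p.mean()) ** 2 + n * (p.mean() - np.mean(p**2))

rng = np.random.default_rng(1)
mu = 0.6
p = rng.random(200)                       # arbitrary, non-equilibrium strategies
c3 = cost_from_distribution(p, mu)
c4 = cost_closed_form(p, mu)

# Evenly spaced strategies approximate the uniform initial condition of the text.
c_uniform = cost_closed_form(np.linspace(0.0, 1.0, 1001), mu)
```

The two evaluations `c3` and `c4` agree to rounding error, and `c_uniform` is close to
$N^2(\mu - 1/2)^2 + N/6$ with $N = 1001$.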

The naive guess $P(p) = \delta(p - \mu)$ is also seen to cancel the $O(N^2)$ terms in C. However,
with such a distribution parties with A close to, but different from, $N\mu$ occur with a sizable
probability. The $O(N)$ terms in C are minimized precisely when the probability of occurrence of
such parties tends to zero, which happens when the population is polarized into two subsets with
opposite attendance strategies. To see this we approximate the two-peaked equilibrium distribution
that is usually obtained in numerical simulations by

$$P(p) = \frac{n_1}{N}\,\delta(p - p_1) + \frac{n_2}{N}\,\delta(p - p_2) \tag{5}$$



One readily sees that the $O(N^2)$ terms are eliminated when $n_1 p_1 + n_2 p_2 = \mu N$, and the
$O(N)$ terms are also eliminated if the two peaks are $p_1 = 0$, $n_1 = N(1 - \mu)$ and $p_2 = 1$,
$n_2 = \mu N$. The relaxation dynamics that tends to minimize individual losses is therefore seen
to also optimize the global cost function defined in Eqs. (3) and (4).
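
The comparison between the naive guess and the polarized distribution can be made concrete with a
short Python sketch. It evaluates the cost in its agent-by-agent form,
$C = (N\mu - \sum_i p_i)^2 + \sum_i p_i(1 - p_i)$; the values of N and $\mu$ are arbitrary examples,
chosen so that $\mu N$ is an integer.

```python
import numpy as np

def cost(p, mu):
    """Cost of Eq. (4), agent by agent: (N mu - sum p_i)^2 + sum p_i (1 - p_i)."""
    p = np.asarray(p, dtype=float)
    return (p.size * mu - p.sum()) ** 2 + np.sum(p * (1.0 - p))

N, mu = 1000, 0.6
naive = np.full(N, mu)                        # every agent uses p_i = mu
n2 = round(mu * N)                            # agents that always go
polarized = np.r_[np.zeros(N - n2), np.ones(n2)]

c_naive = cost(naive, mu)          # the O(N) term survives: N mu (1 - mu)
c_polarized = cost(polarized, mu)  # both terms vanish when mu N is an integer
```

Both distributions have $\langle p \rangle = \mu$, but only the polarized one removes the $O(N)$
contribution.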



III. A DETERMINISTIC DYNAMICS FOR THE GMG

All the agents of the system, through uncoordinated actions, minimize the total cost C, an
aggregate function defined for the whole system. This fact suggests an alternative representation
of the actions of the agents as a synchronous, deterministic dynamics associated to the descent
along the gradient of C. This is described by the following set of coupled differential equations
for the $p_i$'s:

$$\frac{dp_i}{dt} = -\eta\,\frac{\partial C}{\partial p_i}
  = \eta\left[2N(\mu - \langle p \rangle) - (1 - 2p_i)\right] \tag{6}$$

In Eq. (6) $\eta$ stands for a positive free parameter that, as we shall shortly see, provides the
scale for the time evolution of the system. The $O(N^2)$ and $O(N)$ terms in Eq. (4) translate into
a fast and a slow dynamics that involve corrections of the $p_i$ that are respectively $O(N)$ and
$O(1)$. To see this we first derive the dynamics followed by $\langle p \rangle$ by averaging both
sides of Eq. (6) over i. We thus obtain:

$$\frac{dW(t)}{dt} = -2\eta(N-1)\,W(t) - 2\eta\left(\tfrac{1}{2} - \mu\right) \tag{7}$$

where we have set $W(t) = \langle p \rangle - \mu$. This can be integrated explicitly. The
solution is:

$$W(t) = \frac{\mu - 1/2}{N-1} + W_0\, e^{-2\eta(N-1)t} \tag{8}$$



with $W_0$ standing for the initial value of $W(t)$, up to a correction of $O(1/N)$. This
expression in turn allows one to find an approximate solution of the equations of motion for the
individual $p_i$'s. To this end we write an asymptotic approximation of Eq. (6) in which we assume
that a long enough time has elapsed so that $\langle p \rangle - \mu$ can be approximated by the
constant term of $O(1/N)$ in Eq. (8). By keeping only the leading order in N we obtain:

$$\frac{dp_i}{dt} = 2\eta\,(p_i - \mu) \tag{9}$$



Note that the time dependence of $p_i(t)$ involves a positive exponential. However, this equation
is not valid for $t \to \infty$ because the fact that the $p_i$'s are probabilities, and are
therefore bounded between 0 and 1, is not included in the equations but rather in the boundary
conditions of Eqs. (6).

Eqs. (8) and (9) correspond respectively to the fast and slow dynamics mentioned above. In the
first place we see that, except for terms that are $O(1/N)$, $\langle p \rangle$ approaches $\mu$
exponentially with the very short time constant $\lambda = 1/(2\eta N)$, which tends to zero as the
system involves a larger number of individuals. On the other hand, the differences $p_i(t) - \mu$
instead grow exponentially for all i, indicating that the $p_i$'s depart exponentially from the
average $\mu$ and eventually saturate at their largest or smallest possible values, 1 or 0, thus
polarizing the population of agents. This process however takes place with a time constant
$1/(2\eta)$ that is $O(N)$ longer than the one involved in the evolution of $\langle p \rangle$ and
is independent of the size of the system. While the average $\langle p \rangle$ approaches the
value $\mu$ very fast, the individual $p_i$'s depart slowly from the same value.
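
The fast relaxation of the average can be checked with a small Python sketch that integrates the
coupled equations (6) with an Euler scheme and compares $\langle p \rangle - \mu$ with the
exponential solution of Eq. (7), written here with the exact initial value $W(0)$. The step size,
the parameter values and the function names are illustrative assumptions.

```python
import numpy as np

def mean_deviation_exact(t, w_init, mu, eta, n):
    """Solution of Eq. (7) with W(0) = w_init; it relaxes to (mu - 1/2)/(N - 1)."""
    w_inf = (mu - 0.5) / (n - 1)
    return w_inf + (w_init - w_inf) * np.exp(-2.0 * eta * (n - 1) * t)

def integrate_eq6(p0, mu, eta, dt, steps):
    """Euler integration of Eq. (6); returns W = <p> - mu after every step."""
    p = p0.copy()
    n = p.size
    w = np.empty(steps + 1)
    w[0] = p.mean() - mu
    for k in range(steps):
        p += dt * eta * (2.0 * n * (mu - p.mean()) - (1.0 - 2.0 * p))
        w[k + 1] = p.mean() - mu
    return w

rng = np.random.default_rng(2)
n, mu, eta = 101, 0.6, 1.0
p0 = rng.uniform(0.3, 0.7, n)     # start away from 0 and 1 so the bounds play no role
dt, steps = 1e-5, 5000            # total time 0.05, i.e. ten fast time constants
w_num = integrate_eq6(p0, mu, eta, dt, steps)
t = dt * np.arange(steps + 1)
w_exact = mean_deviation_exact(t, w_num[0], mu, eta, n)
```

Over this interval the slow dynamics, with time constant $1/(2\eta) = 0.5$, has barely started, so
the individual $p_i$ are still close to their initial values while the average has already relaxed.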

Eqs. (6) can be tested numerically by approximating them by finite differences. The individual
attendance probabilities $p_i$ are thus taken to be updated as $p_i(t+1) = p_i(t) + \Delta(p_i)$,
where:

$$\Delta(p_i) = \eta\left[2N(\mu - \langle p \rangle) - (1 - 2p_i)\right] \tag{10}$$



The resulting density distributions $P(p)$ obtained with this dynamics are shown in Fig. 1. The
value of $\eta$, and therefore that of the time constant $\lambda = 1/(2\eta N)$, is in principle
arbitrary. However, if $\lambda \gg 1$ the only effects that are noticeable are those of the fast
dynamics, while if $\lambda \ll 1$ the descent towards the minimum keeps bouncing at opposite sides
of the quadratic well and never reaches its bottom. When $1/2 \lesssim \lambda \lesssim 2$ the
descent is gradual enough so that the interplay of both terms in $\Delta(p_i)$ leads the system to
a minimum of C.
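
A minimal Python sketch of this finite-difference descent is given below. It assumes that the
bounds on the probabilities are enforced by clipping each $p_i$ to [0, 1] after every update, which
is one possible reading of the boundary conditions mentioned above; the system size and the number
of steps are arbitrary examples.

```python
import numpy as np

def cost(p, mu):
    """Cost of Eq. (4), agent by agent: (N mu - sum p_i)^2 + sum p_i (1 - p_i)."""
    return (p.size * mu - p.sum()) ** 2 + np.sum(p * (1.0 - p))

def descent_step(p, mu, eta):
    """Update of Eq. (10) followed by clipping, so that every p_i stays a probability."""
    delta = eta * (2.0 * p.size * (mu - p.mean()) - (1.0 - 2.0 * p))
    return np.clip(p + delta, 0.0, 1.0)

rng = np.random.default_rng(3)
N, mu = 101, 0.6
eta = 1.0 / (2.0 * N)               # 2 eta N = 1
p = rng.random(N)                   # uniformly distributed initial strategies
costs = [cost(p, mu)]
for _ in range(5000):
    p = descent_step(p, mu, eta)
    costs.append(cost(p, mu))

# Fraction of agents that ended up next to one of the two extreme strategies.
polarized_fraction = np.mean((p < 0.01) | (p > 0.99))
```

With this step size the recorded cost never increases from one step to the next, and almost all
agents end at $p = 0$ or $p = 1$ with $\langle p \rangle$ close to $\mu$.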

FIG. 1. Probability density distributions obtained after 10 steps (solid line), $2 \times 10^4$
steps (open circles) and $2 \times 10^6$ steps (dashed line), using Eq. (10), $2\eta N = 1$, and
$\mu = 0.6$. The first distribution shows a rigid displacement to the right; the next ones show how
the population is progressively polarized.

The intermediate stages in the gradient descent are also shown in Fig. 1. In the first few steps
the (fast) uniform correction of $O(N)$ is seen to shift the initial distribution rigidly to one
side with the aim of adjusting the value of $\langle p \rangle$ to that of $\mu$. As a consequence,
agents are piled up at one end while the other is completely cleared. Once the leading term in C is
nearly canceled, the slow dynamics gradually gathers agents at both ends of the distribution,
producing minor fluctuations in the value of $\langle p \rangle$. The density distribution $P(p)$
that is finally obtained is seen to correspond to a strongly polarized population, thus reproducing
the main feature of the equilibrium distributions obtained with the rules traditionally used in the
GMG or the BAM.

The present approach yields a density distribution that displays the same polarization that is
found in the GMG or in the BAM. It is remarkable that such a general qualitative agreement is
found, although those frameworks differ deeply from the deterministic formulation. The conceptual
difference between the two approaches lies in the special role played by the record of successes
and failures that is kept in the BAM or GMG and that is completely absent in the present treatment.
The usual rules of the GMG can thus be considered to correspond to a dynamics constrained by the
(positive) balance of points that have been accumulated in the past instances of the game. There
are other differences that deserve further discussion. These are related to the stochastic elements
of the dynamics used in that framework, which are absent from the present one. Within this
approach, these can be assimilated to the effects of a finite temperature. We turn to this point in
the next section.



IV. THERMAL FLUCTUATIONS 

The usual rules of the BAM or the GMG involve a stochastic updating of the attendance probabilities
of each customer. When the account of points of the i-th player falls below zero a new value of
$p_i$ is chosen at random from the interval $(p_i - \delta p/2,\; p_i + \delta p/2)$. This can be
interpreted as a kind of thermal fluctuation in which $\delta p$ can be related to the temperature.

A few qualitative features support this. In equilibrium, the population is drastically polarized
into those that consistently go to the bar (and therefore have $p_i = 1$) and those that do not go
($p_i = 0$). A small fraction having $p_i$'s with intermediate values continuously migrates between
both extreme strategies. This migration causes the value of $\langle p \rangle$ to fluctuate around
$\mu$. These random values of $\langle p \rangle$ have a distribution that is sharply peaked at
that value and has a width that is regulated by $\delta p$. As regards the density distribution
$P(p)$, a small value of $\delta p$ produces sharp peaks at $p = 0$ and $p = 1$ and
$P(p) \simeq 0$ for intermediate values. For larger values of $\delta p$ there is a larger fraction
of players that migrate between $p = 0$ and $p = 1$, thus raising the "bottom" of the distribution
$P(p)$.

The above qualitative arguments provide hints about how to introduce thermal fluctuations in the
deterministic dynamics presented in the preceding section, and also about their relationship with
$\delta p$ for the case of the GMG. However, a singular situation occurs for $\delta p \to 0$,
which is associated to an infinitely long relaxation process, or when $\delta p > 1$, in which case
this parameter loses its physical meaning as a probability.

Thermal-like fluctuations can formally be introduced following the same steps as in the Langevin
approach to describe a Brownian particle. In the present situation we start with Eq. (7) for the
motion of the average value $\langle p \rangle$, and we add a stochastic term $L(t)$ that accounts
for the random fluctuations:

$$\frac{dW_s(t)}{dt} = -2\eta(N-1)\,W_s(t) - 2\eta\left(\tfrac{1}{2} - \mu\right) + L(t) \tag{11}$$

We have added an index s to $W(t)$ in Eq. (11) to stress the fact that this is the value of $W(t)$
in the presence of stochastic external fluctuations. The source of noise $L(t)$ can be taken to be
the average of N uncorrelated sources of random fluctuations affecting all the independent agents.
One still has to specify a parameter related to the statistical properties of the distribution of
the stochastic function $L(t)$. We will shortly prove that this is closely related to the
temperature. As usual we assume

$$\overline{L(t)} = 0 \tag{12}$$

$$\overline{L(t)\,L(t')} = \Gamma\,\delta(t - t') \tag{13}$$



In Eqs. (12) and (13) and in all that follows, $\overline{(\ldots)}$ denotes an average over a
suitable ensemble of replicas of the N-agent system. The parameter $\Gamma$ is a constant that
represents the mean square amplitude of instantaneous, uncorrelated perturbations. The stochastic
differential equation (11) can explicitly be integrated. The result is



$$W_s(t) = W(t) + e^{-2\eta(N-1)t}\int_0^t e^{2\eta(N-1)\omega}\,L(\omega)\,d\omega \tag{14}$$

where $W(t)$ is the solution given in Eq. (8), in which no fluctuations are present. If an average
is made on both sides of Eq. (14) over a sub-ensemble of systems having the same initial condition
$W_0$ appearing in Eq. (8), one can immediately see that Eq. (12) implies that
$\overline{W_s(t)} = W(t)$, and therefore the convergence of $\langle p \rangle$ to $\mu$ (up to
terms $O(1/N)$) is also ensured within the stochastic dynamics. If the mean square fluctuations of
$W_s(t)$ are calculated with the aid of Eq. (13), we get, to leading order in N:

$$\overline{W_s^2(t)} = W^2(t) + \frac{\Gamma}{4\eta N}\left[1 - e^{-4\eta N t}\right] \tag{15}$$

The effect of the stochastic term in $W_s(t)$ is to produce a non-vanishing value of
$\overline{W_s^2(\infty)}$. In ordinary statistical mechanics, the mean square fluctuations of the
stationary solution for the velocity of a Brownian particle are directly related to its average
kinetic energy and can be set equal to kT. By analogy we formally define a temperature parameter T
that is independent of the size of the system as the mean square fluctuation of
$\langle p \rangle$ in an equilibrium configuration, scaled by the number of agents of the system.
Neglecting terms $O(1/N^2)$ we obtain:

$$T = N\,\overline{(\langle p \rangle - \mu)^2} = \frac{\Gamma}{4\eta} \tag{16}$$

The parameter $\eta$ is a factor relating T with the amplitude of the random fluctuations and plays
a role similar to that of the Boltzmann constant.

Eq. (16) allows one to write the ensemble average of the cost C for an equilibrium configuration at
finite temperature. Up to the leading order in N we obtain:

$$\overline{C} = N^2\,\overline{(\mu - \langle p \rangle)^2}
  + N\,\overline{\left(\langle p \rangle - \langle p^2 \rangle\right)}
  = N\left[T + \mu - \overline{\langle p^2 \rangle}\right] \tag{17}$$

C is a positive, extensive magnitude which, in equilibrium, grows linearly with the size of the
system and can therefore be taken to play the role of an internal energy. The linear dependence of
C on the size of the system can be checked for the GMG. To do so we have calculated the cost using
the definition of Eq. (3) with different numbers of agents. We first allowed the system to relax to
the asymptotic equilibrium configuration and then performed a suitable ensemble average over
several replicas of the system. The linear dependence is shown in Fig. 2. The last iteration steps
are used to estimate the dispersion of the numerical result, which is shown with a pair of dotted
lines. The slope of these lines changes slightly with the parameter $\delta p$ of the GMG. This is
due to the relation between T and $\delta p$ that we discuss later.

FIG. 2. Linear dependence of C as a function of N for the GMG, $\mu = 0.6$ and different values of
$\delta p$.
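
The Langevin definition of the temperature in Eq. (16) can be illustrated with a short Python
sketch that integrates Eq. (11) for an ensemble of replicas with the Euler-Maruyama scheme. The
Gaussian form of the noise, the parameter values and the function name are assumptions made for the
illustration; only the white-noise correlation of Eq. (13) is taken from the text.

```python
import numpy as np

def langevin_mean_square(n, mu, eta, gamma, dt, steps, replicas, rng):
    """Euler-Maruyama integration of Eq. (11) for an ensemble of replicas.

    Returns the ensemble average of W_s^2 at the final time.
    """
    a = 2.0 * eta * (n - 1)                  # relaxation rate of the fast dynamics
    b = 2.0 * eta * (0.5 - mu)               # constant drift term of Eq. (11)
    w = np.zeros(replicas)                   # every replica starts with <p> = mu
    for _ in range(steps):
        # White noise obeying Eq. (13) contributes an increment of variance gamma * dt.
        kick = np.sqrt(gamma * dt) * rng.standard_normal(replicas)
        w += -(a * w + b) * dt + kick
    return np.mean(w**2)

rng = np.random.default_rng(4)
n, mu, eta, gamma = 100, 0.6, 0.5, 0.2
w2 = langevin_mean_square(n, mu, eta, gamma, dt=1e-4, steps=1000,
                          replicas=10000, rng=rng)
T_measured = n * w2                          # left-hand side of Eq. (16)
T_expected = gamma / (4.0 * eta)             # right-hand side of Eq. (16)
```

For these values the two estimates differ by a few per cent, which is the combined size of the
$O(1/N)$ corrections, the finite time step and the statistical error of the ensemble.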



V. THERMAL RELAXATION 



To include thermal fluctuations into a numerical treatment of the deterministic dynamics amounts
only to introducing a random additive term in Eq. (10), namely:

$$p_i(t+1) = p_i(t) + \Delta(p_i) + L_\tau^{(i)}(t) \tag{18}$$

where $L_\tau = \tau(1/2 - r)$ and r is a random number uniformly distributed in the interval
[0,1]. This function represents the fluctuations produced on the i-th agent by a thermal bath. The
temperature is defined by the second moment $\Gamma$ of the probability density of $L_\tau(t)$.

The limit in which $L_\tau(t)$ has zero width (and therefore $\tau = 0$) corresponds to the
deterministic dynamics discussed in Sec. III. Larger values of $\tau$ are associated to
fluctuations that may eventually override the updating amplitude $\Delta(p_i)$ and tend to smear
the distribution with two sharp $\delta$-functions, increasing the fraction of the population that
have strategies $p_i \neq 0$ or 1 (see Fig. 3(a)). If $\tau$ is further increased the polarization
is progressively destroyed because the drift of the $p_i$'s towards 0 or 1 has to equilibrate
against random shocks that prevent them from reaching those limiting values.
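
The smearing of the polarization can be reproduced with the following Python sketch of Eq. (18).
It assumes that strategies are clipped to [0, 1] after each update, which the text does not
specify, and it measures the fraction of agents that stay away from both extremes; the system size,
the number of steps and the two values of $\tau$ are examples.

```python
import numpy as np

def thermal_step(p, mu, eta, tau, rng):
    """Eq. (18): deterministic correction of Eq. (10) plus uniform noise of width tau."""
    delta = eta * (2.0 * p.size * (mu - p.mean()) - (1.0 - 2.0 * p))
    noise = tau * (0.5 - rng.random(p.size))     # L_tau = tau (1/2 - r)
    # Assumption: clipping keeps every strategy inside [0, 1].
    return np.clip(p + delta + noise, 0.0, 1.0)

def relax(tau, n=200, mu=0.6, steps=4000, seed=5):
    """Relax n agents from uniformly distributed strategies at noise width tau."""
    rng = np.random.default_rng(seed)
    eta = 1.0 / (2.0 * n)                        # time constant 1/(2 eta N) = 1
    p = rng.random(n)
    for _ in range(steps):
        p = thermal_step(p, mu, eta, tau, rng)
    return p

def interior_fraction(p, edge=0.05):
    """Fraction of agents whose strategy is away from both 0 and 1."""
    return np.mean((p > edge) & (p < 1.0 - edge))

cold = interior_fraction(relax(tau=0.003))       # small fluctuations
hot = interior_fraction(relax(tau=0.5))          # large fluctuations
```

In this sketch the low-noise run remains polarized, while the high-noise run keeps a large share of
the agents at intermediate strategies.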

FIG. 3. (a) Probability density distributions obtained with the thermal dynamics of Eq. (18) for
the values of $\tau$ shown in the inset. (b) Same distributions obtained with the stochastic
dynamics of the GMG, for the values of $\delta p$ shown in the inset.

Given the stochastic dynamics of Eq. (18) together with the definition in Eq. (16), it is possible
to calculate the value of T in an equilibrium configuration and to relate T with $\tau$. The
parameter $\eta$ has to be chosen such that the relaxation of the deterministic dynamics is
guaranteed, i.e. such that the time constant $\lambda = 1/(2\eta N)$ introduced in Sec. III is
$\lambda \simeq 1$. In Fig. 4(a) we show that, as expected, $T \sim \tau^2$.

Eq. (16) also allows one to calculate T in any configuration reached through the stochastic
dynamics of the GMG or the BAM. With this we can check two important features. The first is an
estimation of the finite size corrections in the definition of T given in Eq. (16), i.e. the regime
in which T is independent of the size of the system. The second is to establish a quantitative
relationship between T and the $\delta p$ that goes into the relaxation dynamics of the GMG.

We have calculated $\overline{W_s^2(t)}$ for the GMG using several values of $\delta p$ and N. We
have allowed t to be large enough to reach equilibrium. We have then performed an ensemble average
over several replicas of the system. The last steps have been used to gauge the dispersion of the
numerical values. The results are shown in Fig. 4(b), where we plot $N\overline{W_s^2(\infty)}$ as
a function of $\delta p$.

All the above mentioned features can be extracted from Fig. 4. Firstly, finite size effects are
clearly seen to affect only the smallest systems, up to $N \simeq 500$. Secondly, the independence
of $N\overline{W_s^2(\infty)}$ from the size of the system, as assumed in the definition of
Eq. (16), follows from the fact that the curves for $N > 500$ lump tightly together. Thirdly, a
linear regression of all the curves establishes that $\delta p$ and T have the same physical
interpretation and, within the interval considered, are nearly proportional to each other, namely
$T = K\,\delta p$, with $K = (320 \pm 20) \times 10^{-4}$.

FIG. 4. (a) Relationship between $\tau$ and T as defined in Eq. (16). Solid squares correspond to
the numerical calculation, while the line is the quadratic regression
$10^4\,T = 5.43 \times 10^{-5} - 0.114\,\tau + 703.9\,\tau^2$, with $R^2 = 0.9999$. (b) Linear
dependence of the fluctuations $N\overline{W_s^2}$ on $\delta p < 1$ for the GMG, and several
values of N (indicated in the figure). The upper inset shows that fluctuations saturate at a
limiting value $\simeq 0.05$ if the plot is extended to $\delta p > 1$.

The fact that T and $\delta p$ are conceptually equivalent leads us to extend the GMG simulations
to higher values of $\delta p$. These values have seldom been explored H] in the literature because
this parameter measures the minor adjustments performed by the agents that try to find the "best"
attendance probability. Large values of $\delta p$ could for instance correspond to irresolute or
hesitating agents.
There are however important points that have to be considered. In the first place, the value of
$\delta p$ cannot be taken arbitrarily large, because it measures the uncertainty of the value of a
probability. Values of $\delta p \sim 1$ have therefore little physical meaning. In addition, if
$\delta p$ is nevertheless extended to values higher than 1 by any plausible analytical extension
(for instance using periodic or reflective boundary conditions), the fluctuations
$\overline{W_s^2}$ for $\delta p > 1$ are seen to saturate at an approximately constant value (see
inset in Fig. 4(b)). These facts cause the correspondence between $\delta p$ and T to necessarily
break down.



A comparison of the probability density distributions $P(p)$ obtained with both approaches further
supports this departure. In Fig. 3(b) we show the equilibrium density distributions that are
obtained with the stochastic, asynchronous updating rules of the GMG for two values of $\delta p$
(and $\mu = 0.6$). It is seen that these diverge from those of Fig. 3(a), which are obtained with
the dynamics given in Eq. (18). Note however that there are noticeable resemblances for small
amplitude fluctuations. See for instance the distribution plotted in full line in Fig. 3(b) and the
one for $\tau = 0.003$ in Fig. 3(a).



As mentioned before, the origin of the departure between both dynamics can be found in the scoring
of successes and failures that is used in the GMG and that is absent in the present approach. Some
customers can be considered to be excluded from the updating dynamics as a consequence of their
great accumulation of points. This, for instance, produces the large value of $P(p = 1)$: many
players that have accumulated a large positive account by attending the bar do not change strategy.
The scoring of each player works as a kind of "Maxwell demon" that classifies agents into different
groups, endowing each one with a different updating rate.



The equilibrium configuration that is reached in the GMG therefore entails a distribution of
updating rates in which some players are essentially frozen while others modify their attendance
strategies frequently. This situation is completely different from the one obtained with the
dynamics of Eq. (18), in which all agents undergo stochastic perturbations in every time step.



In order to show this we present in Fig. 5 some results of the GMG, in which we have used a large
value of $\delta p$ ($\delta p = 0.8$) and we have arbitrarily partitioned the ensemble of 1001
players into two sets. One of the sets gathers all players having at most 10 points; the other
contains all the rest. We have plotted their respective density distributions $P(p)$. The agents
having fewer than 11 points are the ones that participate more strongly in the dynamics because
they undergo more frequent updatings.




FIG. 5. Partial probability density distributions of individual attendance strategies for the GMG
for different subsets of players, obtained for 1001 players, a crowding level of 600/1001, and
averages made over 2000 histories. (a) Asymptotic distributions. Subset of players with more than
10 accumulated points (full line) and with fewer than 11 points (dashed line). The total
probability density distribution is shown with empty boxes. (b) Density distributions at the end of
the first 10 steps of the simulation. Players with 0 points (open boxes) have the greatest
mobility; players with 5 and 10 points (full and dashed lines respectively) have lower mobility.
The total density distribution is shown with full triangles.

The above comparison indicates that the GMG and the thermal relaxation dynamics of Eq. (18)
strictly coincide only in the limit $T \to 0$. However, the strong qualitative resemblance of the
results for $\delta p < 0.6$ allows one to interpret $\delta p$, with these limitations, as
equivalent to a thermal fluctuation.

The thermal interpretation of $\delta p$ has one interesting consequence. The most remarkable
feature of the relaxation processes of the GMG performed with large $\delta p$ is that the high
fluctuations prevent quenching (see Fig. 6). This allows one to provide a new framework for the
annealing procedure presented in Refs. || and [7] that resembles more closely the traditional
protocol of Ref. ||.

The method presented in Ref. [6] requires an iterative
procedure which involves a short evolution of the N-agent
system and the subsequent elimination of all points
accumulated in the system. This is repeated until the
distribution P(p) remains stationary. With the present
interpretation of δp, a thermal annealing relaxation for the
GMG can be performed for the cases in which μ is significantly
different from 1/2. This new protocol can be assumed to take
place in episodes. In the first episode, relaxation is allowed
using a value of δp that is large enough to ensure that
equilibrium is reached and quenching is prevented. Each
following episode starts from the equilibrium reached in the
preceding one, and a new relaxation process is allowed with a
smaller value of δp that is still large enough to avoid the
appearance of quenching. The process continues until a lower
bound of δp is reached. Following this "cooling" protocol,
quenching never occurs, an absolute minimum of C is obtained,
and the population remains strongly polarized.




FIG. 6. Asymptotic probability density distributions of
individual attendance strategies for the GMG obtained with the
values of δp that are shown in the inset. Notice that for the
highest value of δp there is no quenching.



VI. CONCLUSIONS 

In the present paper we provide an alternative description of
the dynamics of a system composed of many agents that play the
GMG. This is given in terms of the optimization of a single
global magnitude, instead of in terms of the independent
actions of the N agents. We do this by studying the effect of
introducing a cost function C that is associated with the
second moment of the probability distribution of the size of
the attending parties.

We have proven that C has the relevant properties of 
an internal energy. In equilibrium, it is a positive ex- 
tensive quantity that scales linearly with the number of 
agents N and its minima correspond to equilibrium con- 
figurations with a highly polarized population, as found 
in the BAM or the GMG without quenching. 

In addition, the deterministic dynamics derived from the
descent along the gradient of C leads the system to
configurations whose polarization is equivalent to that found
with the traditional stochastic updating of the BAM or the
GMG. This is a nontrivial equivalence between two completely
different organization schemes of the N-agent system. On the
one hand, the gradient descent gives rise to a set of coupled
differential equations that represents a coordinated evolution
of all the agents, as would result from the action of a
"central planner" of the whole system. On the other hand,
within the GMG all the agents act independently of each other,
adjusting their attendance strategies with the purpose of
optimizing their individual utilities. Even though the two
relaxation mechanisms are very different, the final
configurations of the system turn out to have equivalent
features.

The definition of C in terms of the second moment of the
probability distribution of attending parties is reminiscent
of the many-body Hamiltonian introduced in Ref. [13] to cast a
version of the MG into the spin glass formalism. In the
present case C can also be considered as a many-body
Hamiltonian with one- and two-body interactions in which the
N dynamic variables are the attendance probabilities p_i, with
i = 1, 2, ..., N.
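
To make the one- and two-body structure concrete, the following Python sketch assumes that C is the expected squared deviation of the attendance from the target μN for agents attending independently with probabilities p_i, that is C = (Σ p_i - μN)^2 + Σ p_i (1 - p_i). This explicit form, the step size, the clipping of p_i to [0, 1] and the function names are assumptions made for illustration; they are meant to be consistent with a cost "associated with the second moment" but are not taken from the text.

```python
import numpy as np

def cost(p, mu):
    """Assumed cost: expected squared deviation of attendance from mu*N."""
    n = p.size
    return (p.sum() - mu * n) ** 2 + np.sum(p * (1.0 - p))

def grad_cost(p, mu):
    """Gradient of the assumed cost with respect to each p_i."""
    n = p.size
    # The first term couples all agents; the second is a one-body term.
    return 2.0 * (p.sum() - mu * n) + 1.0 - 2.0 * p

def gradient_descent(p0, mu, step=1e-3, n_steps=20000):
    """Synchronous, deterministic descent; p is kept inside [0, 1]."""
    p = np.array(p0, dtype=float)
    for _ in range(n_steps):
        p = np.clip(p - step * grad_cost(p, mu), 0.0, 1.0)
    return p

# Hypothetical parameters: 101 agents, mu = 0.6, random initial strategies.
rng = np.random.default_rng(0)
p_init = rng.uniform(0.0, 1.0, 101)
p_final = gradient_descent(p_init, mu=0.6)

# Fraction of agents that ended close to p = 0 or p = 1.
polarized = np.mean((p_final < 0.05) | (p_final > 0.95))
```

With this assumed form the collective term drives the mean attendance toward μN, while the one-body term p_i (1 - p_i) is smallest at p_i = 0 or 1, which is the mechanism that pushes the agents toward the polarized configurations.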

The introduction of C and the associated relaxation process
allows one to define a temperature parameter through a
Langevin-like approach. The value of T remains associated with
the ensemble average of the square of the fluctuations of the
attendance, scaled by the number of agents. Its introduction
in C provides the proof that this quantity, in thermal
equilibrium, scales linearly with the size N of the system and
therefore qualifies as an extensive parameter.

On the other hand, in order to be an intensive parameter, T
should be independent of the size of the system. This has been
checked numerically for the case of the GMG. However, finite
size effects in the definition of T become negligible only for
systems that are significantly larger than the minimal ones
that already display the self-organization features that have
spurred the popularity of the Minority Game.

Thermal fluctuations can be included in the dynamics that
corresponds to the descent along the gradient of C. The
corresponding distributions P(p) can readily be found, and T
can be compared with the δp involved in the relaxation of the
GMG or the BAM. A direct relationship can be established
between both parameters, but only in the limit δp → 0. We have
also considered the dynamics of the GMG with moderately large
values of δp, for which the divergence between the GMG and the
thermal dynamics is still not important. A stochastic updating
that involves large values of δp could be thought of as
associated with irresolute or badly informed agents that
correct their attendance probabilities by performing
significant changes in each correction.
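
A gradient descent perturbed by thermal noise can be sketched with an Euler-Maruyama step, as below. This is only an illustration under assumptions: the cost is taken to be C = (Σ p_i - μN)^2 + Σ p_i (1 - p_i), the noise is Gaussian with variance 2 T dt per step, and p_i is kept in [0, 1] by clipping. The normalization of T used here is the conventional Langevin one and need not coincide with the definition adopted in the text; all names are hypothetical.

```python
import numpy as np

def grad_cost(p, mu):
    # Gradient of the assumed cost C = (sum(p) - mu*N)**2 + sum(p*(1-p)).
    return 2.0 * (p.sum() - mu * p.size) + 1.0 - 2.0 * p

def langevin_relaxation(p0, mu, temperature, dt=1e-3, n_steps=5000, seed=0):
    """Gradient descent on C plus Gaussian noise of strength sqrt(2*T*dt)."""
    rng = np.random.default_rng(seed)
    p = np.array(p0, dtype=float)
    noise_scale = np.sqrt(2.0 * temperature * dt)
    for _ in range(n_steps):
        kick = noise_scale * rng.standard_normal(p.size)
        # Clipping is one possible boundary treatment (assumption).
        p = np.clip(p - dt * grad_cost(p, mu) + kick, 0.0, 1.0)
    return p

# Hypothetical parameters: same start, one run without and one with noise.
p_start = np.random.default_rng(1).uniform(0.0, 1.0, 101)
p_cold = langevin_relaxation(p_start, mu=0.6, temperature=0.0)
p_warm = langevin_relaxation(p_start, mu=0.6, temperature=0.05)
```

At `temperature=0.0` the noise term vanishes and the routine reduces to the deterministic descent; comparing histograms of `p_cold` and `p_warm` gives a rough picture of how the fluctuations smear the polarized distribution.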

The GMG relaxation for large values of δp avoids quenching
even for μ significantly different from 1/2. This fact,
together with the thermal interpretation of δp, allows one to
cast the annealing procedure presented in Ref. [6] into the
more traditional framework in which T is progressively reduced
in successive epochs. This "cooling" protocol could well be
assimilated to a succession of learning episodes of the
many-agent system. In the first episodes, in which agents have
little "experience" and the information about the past is
scarce, all agents perform large amplitude, even random,
corrections. In the last episodes of the relaxation process,
as there is richer information about the past history of the
system, the agents perform finer corrections, the fluctuations
are smaller, and the cost paid for a wrong attendance is also
smaller.

The fact that, on the one hand, an extensive magnitude can be
defined playing the role of an internal energy and that, on
the other, a microscopic definition of the temperature can be
made opens the way to a full thermodynamic description of a
system of N agents performing a GMG. This amounts to
introducing a Gibbs distribution defined as
Φ(C) = exp(-C/T)/Z, where Z stands for the partition function.
All thermodynamic functions should follow from this.
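
As a sketch of what such a description would involve, and assuming that the standard canonical-ensemble relations carry over with C as the energy, T measured in units in which the Boltzmann constant equals 1, and the attendance probabilities p_i integrated over the unit hypercube (all of these are assumptions, as are the symbols Φ, F and S), one may write:

```latex
\begin{align}
  \Phi(C) &= \frac{e^{-C/T}}{Z}, &
  Z &= \int_{[0,1]^N} \prod_{i=1}^{N} dp_i \; e^{-C(p_1,\dots,p_N)/T}, \\
  \langle C \rangle &= T^{2}\,\frac{\partial \ln Z}{\partial T}, &
  F &= -T \ln Z, \qquad S = -\frac{\partial F}{\partial T}.
\end{align}
```

Here the ensemble average of C plays the role of the internal energy, F that of a free energy and S that of an entropy, so that F = ⟨C⟩ - T S holds by construction.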